Applications Of A Lexicographical Data Base For German
Abstract
The Institut für deutsche Sprache has recently begun setting up a LExicographical DAta Base for German (LEDA). This data base is designed to improve efficiency in the collection, analysis, ordering and description of language material by facilitating access to textual samples within corpora and to word articles within machine readable dictionaries, and by providing a frame to store results of lexicographical research for further processing. LEDA thus consists of three components, the Text Bank, the Dictionary Bank and the Result Bank, and serves as a tool to support monolingual German dictionary projects at the Institute and elsewhere.

I INTRODUCTORY REMARKS

Since the foundation of the Institut für deutsche Sprache in 1964, its research has been based on empirical findings; samples of language produced in spoken or written form were the main basis. To handle efficiently the large quantities of text to be researched, it was necessary to use a computer, to assemble machine readable corpora and to develop programs for corpus analysis. An outline of the computational activities of the Institute is given in LDV-Info (1981 ff); the basic corpora are described in Teubert (1982).

The present mainframe computer, which was installed in January 1983, is a Siemens 7.536 with a core storage of 2 megabytes, a number of tape and disc decks and, at the moment, 15 visual display units for interactive use. Whereas in former years most jobs were carried out in batch, the terminals now make it possible for the linguist to work interactively with the computer. It was therefore a logical step to devise the Lexicographical Data Base for German (LEDA) as a tool for the compilation of new dictionaries. The principle of interactive use demands a different concept of programming, in which the lexicographer himself can choose from the menu of alternatives offered by the system and fix his own search parameters.
Work on the Lexicographical Data Base was begun in 1981; a first version incorporating all three components is planned to be ready for use in 1986.

What is the goal of LEDA? In any lexicographical project, once the concept for the new dictionary has been established, there are three major tasks where the computer can be employed:

(i) For each lemma, textual samples have to be determined in the corpus which is the linguistic base of the dictionary. The text corpus and the programs to be applied to it will form one component of LEDA, namely the Text Bank.

(ii) For each lemma, the lexicographer will want to compare corpus samples with the respective word articles of existing relevant dictionaries. For easy access, these dictionaries should be transformed into a machine readable corpus of integrated word articles. The word corpus and the pertaining retrieval programs will form the second component, i.e. the Dictionary Bank.

(iii) Once the formal structure of the word articles in the new dictionary has been established, description of the lemmata within the framework of this structure can begin. A data base system will provide this frame so that homogeneous and interrelated descriptions can be carried out by each member of the dictionary team at all stages of the compilation. This component of LEDA we call the Result Bank.
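The division of labour among the three components can be illustrated with a minimal Python sketch. It is not LEDA's implementation, which the abstract does not describe; the class names, method names, fields and sample data below are all hypothetical and only mirror the three tasks listed above: finding corpus samples for a lemma, looking up existing dictionary articles, and storing the new description in one fixed structure.

```python
from dataclasses import dataclass, field


@dataclass
class TextBank:
    """Hypothetical stand-in for the corpus component: holds raw sentences."""
    sentences: list[str] = field(default_factory=list)

    def samples(self, lemma: str) -> list[str]:
        # Naive match on whitespace-separated tokens, ignoring case and
        # trailing punctuation; a real system would need lemmatization.
        target = lemma.lower()
        return [
            s for s in self.sentences
            if target in (tok.strip(".,;:!?").lower() for tok in s.split())
        ]


@dataclass
class DictionaryBank:
    """Hypothetical stand-in: maps a lemma to articles from existing dictionaries."""
    articles: dict[str, dict[str, str]] = field(default_factory=dict)

    def lookup(self, lemma: str) -> dict[str, str]:
        # Returns {dictionary name: article text}, or {} if the lemma is unknown.
        return self.articles.get(lemma, {})


@dataclass
class ResultBank:
    """Hypothetical stand-in: stores new descriptions in one shared structure."""
    entries: dict[str, dict[str, str]] = field(default_factory=dict)

    def store(self, lemma: str, part_of_speech: str, definition: str) -> None:
        # Every entry has the same keys, so descriptions stay homogeneous.
        self.entries[lemma] = {
            "part_of_speech": part_of_speech,
            "definition": definition,
        }


# Invented sample data, for illustration only.
text_bank = TextBank(["Das Haus ist alt.", "Er kauft ein Auto."])
dictionary_bank = DictionaryBank({"Haus": {"Dictionary A": "Gebäude zum Wohnen"}})
result_bank = ResultBank()

lemma = "Haus"
samples = text_bank.samples(lemma)         # task (i): corpus samples
existing = dictionary_bank.lookup(lemma)   # task (ii): existing articles
result_bank.store(lemma, "noun", "building used as a dwelling")  # task (iii)
```

Keeping the three banks as separate objects reflects the abstract's point that each serves a distinct stage of compilation; how the real components were stored and queried is not stated in the source.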
Similar resources
QuickLexSort: An efficient algorithm for lexicographically sorting nested restrictions of a database
Lexicographical sorting is a fundamental problem with applications to contingency tables, databases, Bayesian networks, and more. A standard method to lexicographically sort general data is to iteratively use a stable sort – a sort which preserves existing orders. Here we present a new method of lexicographical sorting called QuickLexSort. Whereas a stable sort based lexicographical sorting alg...
vernetziko: A Cross-Reference Management Tool for the Lexicographer’s Workbench
vernetziko is an assistive software tool primarily designed for managing cross-references in XML-based electronic dictionaries. In its current form it has been developed as an integral part of the lexicographic editing environment for the German monolingual dictionary elexiko developed and compiled at the Institut für Deutsche Sprache, Mannheim. This paper first briefly outlines how vernetziko ...
A Nonlinear Model of Economic Data Related to the German Automobile Industry
Prediction of economic variables is a basic component not only for economic models, but also for many business decisions. But it is difficult to produce accurate predictions in times of economic crises, which cause nonlinear effects in the data. Such evidence appeared in the German automobile industry as a consequence of the financial crisis in 2008/09, which influenced exchange rates and a...
Two variable orthogonal polynomials and structured matrices
We consider bivariate real valued polynomials orthogonal with respect to a positive linear functional. The lexicographical and reverse lexicographical orderings are used to order the monomials. Recurrence formulas are derived between polynomials of different degrees. These formulas link the orthogonal polynomials constructed using the lexicographical ordering with those constructed using the re...
Lexicographical ordering by spectral moments of trees with a given bipartition
Lexicographic ordering by spectral moments ($S$-order) among all trees is discussed in this paper. For two given positive integers $p$ and $q$ with $p \leqslant q$, we denote $\mathscr{T}_n^{p, q}=\{T : T \text{ is a tree of order } n \text{ with a } (p, q)\text{-bipartition}\}$. Furthermore, the last four trees, in the $S$-order, among $\mathscr{T}_n^{p, q}$ $(4 \leqslant p \leqslant q)$ are characterized.
Publication date: 1984